Abstract. If E is an elliptic curve defined over Q and p is a prime of good reduction for E, 
let $E(\mathbb{F}_p)$ denote the set of points on the reduced curve modulo p. Define an arithmetic function 
$M_E(N)$ by setting $M_E(N) := \#\{p : \#E(\mathbb{F}_p) = N\}$. Recently, David and the third author studied 
the average of $M_E(N)$ over certain "boxes" of elliptic curves E. Assuming a plausible conjecture 
about primes in short intervals, they showed the following: for odd N, the average of $M_E(N)$ over 
a box with sufficiently large sides is $\sim K^*(N)/\log N$ for an explicitly given function $K^*(N)$. 

The function K* (N) is somewhat peculiar: defined as a product over the primes dividing N, it 
resembles a multiplicative function at first glance. But further inspection reveals that it is not, and 
so one cannot directly investigate its properties by the usual tools of multiplicative number theory. 
In this paper, we overcome these difficulties and prove a number of statistical results about $K^*(N)$. 
For example, we determine the mean value of K* (N) over odd N and over prime N, and we show 
that K* (N) has a distribution function. We also explain how our results relate to existing theorems 
and conjectures on the multiplicative properties of #E(F p ), such as Koblitz's conjecture. 



1. Introduction 

Let E be an elliptic curve defined over the field Q of rational numbers. For the sake of concrete- 
ness, we assume that the affine points of E are given by a Weierstrass equation of the form 

\[ E : Y^2 = X^3 + aX + b, \tag{1} \]

where a and b are integers satisfying the condition $-16(4a^3 + 27b^2) \ne 0$. For any prime p where 
E has good reduction, we let $E(\mathbb{F}_p)$ denote the group of $\mathbb{F}_p$-points on the reduced curve. In [11], 
Kowalski introduced the arithmetic function $M_E(N)$, defined by 

\[ M_E(N) = \#\{p \text{ prime} : \#E(\mathbb{F}_p) = N\}. \]

The Hasse bound [8] implies that if p is counted by $M_E(N)$, then p lies between $(\sqrt{N} - 1)^2$ and 
$(\sqrt{N} + 1)^2$. Thus, $M_E(N)$ is a well-defined (finite) integer. 
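The definition of $M_E(N)$ and the Hasse window can be made concrete with a brute-force computation. The following is only a minimal sketch: the helper names `count_points` and `M_E` are illustrative, and "good reduction" is approximated by asking that p not divide the discriminant of the given Weierstrass model.

```python
from math import isqrt

def is_prime(n):
    """Trial-division primality test (fine for the tiny ranges used here)."""
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))

def count_points(a, b, p):
    """#E(F_p) for E: Y^2 = X^3 + aX + b, including the point at infinity."""
    # roots[r] = number of y in F_p with y^2 = r
    roots = [0] * p
    for y in range(p):
        roots[y * y % p] += 1
    return 1 + sum(roots[(x**3 + a * x + b) % p] for x in range(p))

def M_E(a, b, N):
    """Number of primes p (not dividing the discriminant) with #E(F_p) = N."""
    disc = -16 * (4 * a**3 + 27 * b**2)
    r = isqrt(N)
    # By the Hasse bound, only primes between (sqrt(N)-1)^2 and (sqrt(N)+1)^2
    # can contribute; the window below is slightly wider to stay in integers.
    lo, hi = max(2, (r - 1) ** 2), (r + 2) ** 2
    return sum(1 for p in range(lo, hi + 1)
               if is_prime(p) and disc % p != 0 and count_points(a, b, p) == N)

# Example with the curve Y^2 = X^3 + X + 1.
print(count_points(1, 1, 5), M_E(1, 1, 9))
```

For this curve the only prime in the Hasse window around N = 9 with exactly nine points is p = 5.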
The problem of obtaining good estimates for $M_E(N)$ appears to be very difficult. 
The condition imposed by Hasse's bound together with an upper bound sieve gives the weak upper 
bound $M_E(N) \ll \sqrt{N}/\log(N+1)$ for any $N \ge 1$. Except in the case that E has complex 
multiplication, nothing stronger is known. As we will explain later, the average value of M E (N) 
as N varies over various sets of integers is related to some important theorems and conjectures 
in number theory. In [4], David and the third author established an "average value theorem" for 
M E (N) as E varies over a family of elliptic curves. Unfortunately, because of the restriction that 
all primes counted by $M_E(N)$ lie between $(\sqrt{N} - 1)^2$ and $(\sqrt{N} + 1)^2$, the result is necessarily 
conditional upon a conjecture about the distribution of primes in short intervals (see Conjecture 1.5 
below). 
The main result of [4] introduced a rather bizarre arithmetic function, which was called K(N) 
because it is "almost a constant". In order to define K(N), we recall the common notation $\nu_p(n)$ 
for the exact power of p that divides n, so that $n = \prod_p p^{\nu_p(n)}$. We also recall the Kronecker symbol 
$\left(\frac{a}{b}\right)$, an extension of the Jacobi symbol that is defined for all integers a and $b \ne 0$ (see, for instance, 
[3, Definition 1.4.8, page 28]). 

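Definition 1.1. For any positive integer N, we define 

\[ K(N) = \prod_{p \nmid N} \left(1 - \frac{\left(\frac{N-1}{p}\right)^2 p + 1}{(p-1)^2 (p+1)}\right) \prod_{\substack{p \mid N \\ 2 \nmid \nu_p(N)}} \left(1 - \frac{1}{p^{\nu_p(N)}(p-1)}\right) \prod_{\substack{p \mid N \\ 2 \mid \nu_p(N)}} \left(1 - \frac{p - \left(\frac{-N_p}{p}\right)}{p^{\nu_p(N)+1}(p-1)}\right), \]

where $N_p = N/p^{\nu_p(N)}$. We also define $K^*(N) = K(N)N/\phi(N)$, where $\phi(N)$ is the usual Euler 
totient function. 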

As we will see later, it is actually the function $K^*(N)$ that has an interesting connection to the 
function $M_E(N)$. The purpose of the present work is a statistical study of the function $K^*(N)$. 
Our computations will illustrate a technique for dealing with arithmetic functions that have a form 
similar to, but are not exactly, multiplicative functions. Our first main result is the computation of 
the average value of K* on odd integers N (our motivation for considering only odd integers will 
be explained after Proposition 1.6 below). 

Theorem 1.2. The average value of $K^*(N)$ on odd numbers N is $\frac{2}{3}$. More precisely, for $x \ge 2$, 

\[ \sum_{\substack{N \le x \\ N \text{ odd}}} K^*(N) = \frac{x}{3} + O\!\left(\frac{x}{\log x}\right). \]
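The mean value in Theorem 1.2 can be explored numerically. The sketch below implements Definition 1.1 for odd N in exact rational arithmetic, under one assumption that should be read loudly: the infinite product over primes not dividing N is cut off at a hypothetical parameter `bound`, so the function computes a truncation of K(N), not K(N) itself. All function names are illustrative.

```python
from fractions import Fraction

def primes_upto(n):
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p**0.5) + 1))]

def factorize(n):
    """Prime factorization of n >= 1 as a dict {prime: exponent}."""
    fac, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            fac[d] = fac.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        fac[n] = fac.get(n, 0) + 1
    return fac

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p not dividing a."""
    return 1 if pow(a % p, (p - 1) // 2, p) == 1 else -1

def K_trunc(N, bound):
    """Definition 1.1 for odd N, with primes p not dividing N cut off at p <= bound."""
    fac = factorize(N)
    val = Fraction(1)
    for p in primes_upto(bound):
        if p in fac:
            continue
        # The squared Kronecker symbol ((N-1)/p)^2 is 0 if p | N-1 and 1 otherwise.
        chi_sq = 0 if (N - 1) % p == 0 else 1
        val *= 1 - Fraction(chi_sq * p + 1, (p - 1) ** 2 * (p + 1))
    for p, v in fac.items():
        if v % 2 == 1:
            val *= 1 - Fraction(1, p**v * (p - 1))
        else:
            N_p = N // p**v
            val *= 1 - Fraction(p - legendre(-N_p, p), p ** (v + 1) * (p - 1))
    return val

def K_star_trunc(N, bound):
    """Truncated K(N) times N/phi(N)."""
    val = K_trunc(N, bound)
    for p in factorize(N):
        val *= Fraction(p, p - 1)
    return val

def mean_K_star(x, bound):
    """Average of the truncated K* over odd N <= x."""
    odd = range(1, x + 1, 2)
    return sum(float(K_star_trunc(N, bound)) for N in odd) / len(odd)

print(mean_K_star(2001, 50))
```

Because every omitted factor lies slightly below 1, the truncation overestimates K(N) a little; with the cutoff used here the printed mean should still sit close to 2/3.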



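Our second main result is the computation of the average value of K* on primes. We employ 
the usual notation $\pi(x) = \#\{p \le x : p \text{ is prime}\}$. 

Theorem 1.3. The average value of $K^*(p)$ on primes p is $\frac{2}{3} C_2 J$, where 

\[ C_2 = \prod_{p > 2} \left(1 - \frac{1}{(p-1)^2}\right) \tag{2} \]

and 

\[ J = \prod_{p > 2} \left(1 + \frac{1}{(p-2)(p-1)(p+1)}\right). \tag{3} \]

More precisely, given any A > 1, we have 

\[ \sum_{p \le x} K^*(p) = \frac{2}{3} C_2 J\, \pi(x) + O_A\!\left(\frac{x}{(\log x)^A}\right) \tag{4} \]

for $x \ge 2$. Furthermore, the same asymptotic formula holds for $\sum_{p \le x} K(p)$. 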



Remark. We have written C 2 and J as two separate constants because C 2 arises naturally by itself 
in the analysis of the function K(N) (see equation (5)). 
The technique we use to establish Theorems 1.2 and 1.3, which is dictated by the unusual 
Definition 1.1 for K(N), is of interest in its own right: the function K looks much like a multiplicative 
function but actually is not. One can rewrite Definition 1.1 in the following form: when N is odd, 

\[ K(N) = \tfrac{2}{3} C_2 F(N-1) G(N), \tag{5} \]

where $C_2$ is the twin primes constant defined in equation (2), 

\[ F(n) = \prod_{\substack{p \mid n \\ p > 2}} \left(1 - \frac{1}{(p-1)^2}\right)^{-1} \left(1 - \frac{1}{(p-1)^2(p+1)}\right), \tag{6} \]

and 

\[ G(n) = \prod_{\substack{p^\alpha \| n \\ \alpha \text{ odd}}} \left(1 - \frac{1}{(p-1)^2}\right)^{-1} \left(1 - \frac{1}{p^\alpha (p-1)}\right) \prod_{\substack{p^\alpha \| n \\ \alpha \text{ even}}} \left(1 - \frac{1}{(p-1)^2}\right)^{-1} \left(1 - \frac{p - \left(\frac{-n/p^\alpha}{p}\right)}{p^{\alpha+1}(p-1)}\right). \tag{7} \]

The function G(n) would be multiplicative were it not for the presence of the term $\left(\frac{-n/p^\alpha}{p}\right)$, 
which depends on the residue class of n (mod $p^{\alpha+1}$). The function F is truly multiplicative, but 
it is being evaluated at N − 1, not N, in the formula (5). Therefore we are forced to deal with 
the correlation between one multiplicative function and another that is not quite multiplicative. 
Given the definitions of F and G, it is rather startling (to say the least) that the average value of 
$K^*(N) = \frac{2}{3} C_2 F(N-1) G(N)\, N/\phi(N)$ over odd numbers N is exactly equal to $\frac{2}{3}$. 
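The decomposition (5) can be sanity-checked in exact arithmetic once every infinite product is truncated at the same point. The sketch below assumes a cutoff `P` that is at least as large as every prime factor of N(N − 1), in which case the truncated identity holds exactly, and it reads F and G as products of local factors of Definition 1.1 divided by the local factors $1 - 1/(p-1)^2$ of $C_2$. The names `K_trunc`, `F`, `G`, `C2_trunc` and `check` are illustrative, and N = 1 is excluded because every prime divides N − 1 = 0.

```python
from fractions import Fraction

def primes_upto(n):
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p**0.5) + 1))]

def factorize(n):
    fac, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            fac[d] = fac.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        fac[n] = fac.get(n, 0) + 1
    return fac

def legendre(a, p):
    return 1 if pow(a % p, (p - 1) // 2, p) == 1 else -1

def twin(p):
    """Local factor of the twin primes constant C_2 at an odd prime p."""
    return 1 - Fraction(1, (p - 1) ** 2)

def local_at_divisor(p, v, n):
    """Factor of Definition 1.1 at an odd prime p with p^v exactly dividing n."""
    if v % 2:
        return 1 - Fraction(1, p**v * (p - 1))
    return 1 - Fraction(p - legendre(-(n // p**v), p), p ** (v + 1) * (p - 1))

def K_trunc(N, P):
    fac = factorize(N)
    val = Fraction(1)
    for p in primes_upto(P):
        if p not in fac:
            chi_sq = 0 if (N - 1) % p == 0 else 1
            val *= 1 - Fraction(chi_sq * p + 1, (p - 1) ** 2 * (p + 1))
    for p, v in fac.items():
        val *= local_at_divisor(p, v, N)
    return val

def F(n):
    val = Fraction(1)
    for p in factorize(n):
        if p > 2:
            val *= (1 - Fraction(1, (p - 1) ** 2 * (p + 1))) / twin(p)
    return val

def G(n):
    val = Fraction(1)
    for p, v in factorize(n).items():
        val *= local_at_divisor(p, v, n) / twin(p)
    return val

def C2_trunc(P):
    val = Fraction(1)
    for p in primes_upto(P):
        if p > 2:
            val *= twin(p)
    return val

def check(N, P=200):
    """Truncated form of (5) for odd N >= 3 whose N(N-1) is P-friable."""
    return K_trunc(N, P) == Fraction(2, 3) * C2_trunc(P) * F(N - 1) * G(N)

print([check(N) for N in (3, 9, 15, 45, 75, 105, 147)])
```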

The fact that we can successfully compute average values of the function K*, even though it is 
not truly multiplicative, makes it natural to wonder whether we can analyze K* in other ways; this 
is indeed the case. Our next result is an analogue for $K^*(N)$ of a classical result of Schoenberg 
[14] for the function $n/\phi(n)$. Recall that a distribution function D(u) is a nondecreasing, 
right-continuous function $D : \mathbb{R} \to [0, 1]$ for which $\lim_{u \to -\infty} D(u) = 0$ and $\lim_{u \to \infty} D(u) = 1$. 

Theorem 1.4. The function K* possesses a distribution function relative to the set of odd natural 
numbers N. In other words, there exists a distribution function D(u) with the property that at each 
of its points of continuity, 

\[ D(u) = \lim_{x \to \infty} \frac{1}{x/2}\, \#\{N \le x,\ N \text{ odd} : K^*(N) \le u\}. \]

As a consequence of Theorems 1.2 and 1.3, we are able to show that the main result of [4] 
is consistent with various unconditional results. As mentioned above, the restriction imposed by 
the Hasse bound creates a short-interval problem in any study of M E (N) when N is held fixed. 
Indeed, the interval is so short that not even the Riemann hypothesis is any help. This problem 
is circumvented in [4] by assuming a conjecture in the spirit of the classical Barban-Davenport- 
Halberstam theorem. 

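Conjecture 1.5. Recall the notation $\theta(x; q, a) = \sum_{p \le x,\ p \equiv a \pmod{q}} \log p$. Let $0 < \eta \le 1$ and $\beta > 0$ 
be real numbers. Suppose that X, Y, and Q are positive real numbers satisfying $X^\eta \le Y \le X$ 
and $Q \le Y$. Then 

\[ \sum_{q \le Q} \sum_{\substack{a = 1 \\ (a,q) = 1}}^{q} \left| \theta(X+Y; q, a) - \theta(X; q, a) - \frac{Y}{\phi(q)} \right|^2 \ll_{\eta,\beta} YQ \log X + \frac{Y^2}{(\log X)^\beta}. \]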

Remark. Languasco, Perelli, and Zaccagnini [12] have established Conjecture 1.5 
in the range $\eta > \frac{7}{12}$; they also showed, assuming the generalized Riemann hypothesis, that any 
$\eta > \frac{1}{2}$ is admissible. 

Given integers a and b satisfying $-16(4a^3 + 27b^2) \ne 0$, let $E_{a,b}$ denote the elliptic curve given 
by the Weierstrass equation (1). Then, given positive parameters A and B, let S(A, B) denote the 
set defined by 

\[ S(A, B) = \{E_{a,b} : |a| \le A,\ |b| \le B,\ -16(4a^3 + 27b^2) \ne 0\}. \]

In [4], David and the third author established the following average value theorem (in fact a 
stronger version of it) for M E (N) taken over the family S(A, B). 

Proposition 1.6. Assume the Barban-Davenport-Halberstam estimate (Conjecture 1.5) holds for 
some $\eta < \frac{1}{2}$. Let N be a positive odd integer, let $\varepsilon$ be a positive real number, and let $A > N^{1/2+\varepsilon}$ 
and $B > N^{1/2+\varepsilon}$ be real numbers satisfying $AB > N^{3/2+\varepsilon}$. Then for any positive real number R, 

\[ \frac{1}{\#S(A,B)} \sum_{E \in S(A,B)} M_E(N) = \frac{K^*(N)}{\log N} + O\!\left(\cdots\right). \]

Remarks. 

(1) It is not necessary to assume that Conjecture 1.5 holds for a fixed $\eta < 1/2$. It is enough to 
assume that it holds for $Y = \sqrt{X}/(\log X)^{\beta+2}$. 

(2) Proposition 1.6 should hold also for even integers N, though perhaps with some modification 
of the arithmetic function K(N) at even arguments. 

We note, as in [11], that computing the average value of $M_E(N)$ over the integers $N \le x$ is 
easily seen to be equivalent to the prime number theorem. In particular, 

\[ \sum_{N \le x} M_E(N) = \sum_{p \le (\sqrt{x}+1)^2} \#\{N \le x : \#E(\mathbb{F}_p) = N\} = \pi(x) + O(\sqrt{x}). \tag{8} \]

Similarly, the average value of $M_E(N)$ taken over the integers $N \le x$ that satisfy a congruence 
condition is equivalent to an appropriate application of the Chebotarev density theorem. For 
example, if the 2-division field of E is an $S_3$-extension of Q, then the Chebotarev density theorem 
implies that 

\[ \sum_{\substack{N \le x \\ N \text{ odd}}} M_E(N) \sim \frac{1}{3} \frac{x}{\log x}. \]

(The calculation of the constant $\frac{1}{3}$ reduces to the fact that two thirds of the elements of $\mathrm{GL}_2(\mathbb{Z}/2\mathbb{Z})$, 
which is the automorphism group of E[2], have even trace.) If E is given by the Weierstrass 
equation (1), the 2-division field is easily seen to be the splitting field of the polynomial $X^3 + aX + b$. 
Since almost all cubics (when ordered by height) have $S_3$ as their Galois groups, it seems 
reasonable to conjecture that 

\[ \frac{1}{\#S(A,B)} \sum_{\substack{N \le x \\ N \text{ odd}}} \sum_{E \in S(A,B)} M_E(N) = \frac{1}{3} \frac{x}{\log x} + O\!\left(\frac{x}{(\log x)^2}\right), \tag{9} \]

provided that A and B are growing fast enough with respect to x. A precise version of this con- 
jecture was established by Banks and Shparlinski [2, Theorem 19]. (In fact, their theorem shows 
that an analogous estimate holds with the condition "N odd" replaced by "m | N", for any given 
integer m.) The asymptotic result (9), together with Theorem 1.2, shows that if we average the 
two sides of the equation in Proposition 1.6, we obtain consistent results (unconditionally). We 
can therefore, if we wish, view Theorem 1.2 as additional evidence for the conclusion of Proposi- 
tion 1.6. 

The techniques we use to establish Theorem 1.2 also show that the average value of $K^*(N)$ over 
all integers (odd and even) equals 1. At the moment we do not know that Proposition 1.6 holds 
when N is even; if we did, however, we could then infer the asymptotic formula 

\[ \frac{1}{\#S(A,B)} \sum_{N \le x} \sum_{E \in S(A,B)} M_E(N) = \frac{x}{\log x} + O\!\left(\frac{x}{(\log x)^2}\right) \]

for the double average, which is consistent with equation (8). 

A similar problem arises if we consider only primes p instead of all odd integers N. Computing 
the average value of M E (p) over the primes p < x is easily seen to be equivalent to the famous 
Koblitz conjecture [10]: 

Conjecture 1.7 (Koblitz). Given an elliptic curve E defined over the rational field Q, there exists 
a constant C(E) with the property that as $x \to \infty$, 

\[ \sum_{\substack{p \le x \\ p \text{ prime}}} M_E(p) \sim C(E) \frac{x}{(\log x)^2}. \]

The constant C(E) appearing in Koblitz's conjecture may be zero, in which case the asymptotic 
is interpreted to mean that there are only finitely many primes p such that $M_E(p) > 0$. An obvious 
obstruction to there being infinitely many primes with $M_E(p) > 0$ is for E to be isogenous to a 
curve possessing nontrivial rational torsion. It was once thought that this was the only case when 
C(E) = 0, but this turned out to be false; see [16, Section 1.1] for an explicit counterexample due 
to Nathan Jones. 

The main theorem of [1] may be reinterpreted to say that the asymptotic formula 

\[ \frac{1}{\#S(A,B)} \sum_{\substack{p \le x \\ p \text{ prime}}} \sum_{E \in S(A,B)} M_E(p) = \frac{2}{3} C_2 J \frac{x}{(\log x)^2} + O\!\left(\frac{x}{(\log x)^3}\right) \tag{10} \]

holds unconditionally for A and B growing fast enough with respect to x. Jones [9] has averaged 
the explicit formula for C(E) over the family S(A, B) and shown that the result is consistent with 
the above formula. We view this as providing good evidence for the Koblitz conjecture. Equa- 
tion (10), together with our Theorem 1.3, shows that we obtain consistent results (unconditionally) 
when we average the two sides of the equation in Proposition 1.6 over the primes N < x. Thus all 
of the conjectures and conditional theorems mentioned above reinforce one another's validity. 

The remainder of the article is organized as follows. We begin by establishing Theorem 1.2 in 
Section 2. Briefly, we approximate the function K* (N) by a similar function whose values depend 
only upon the small primes dividing N and N — 1; we then calculate the average value of this 
truncated function by partitioning the numbers being averaged over into "configurations" based 
on their residue classes modulo powers of these small primes. We prove the related Theorem 1.3 
in Section 3; here the calculation of the main term is simpler since the argument of K* is always 
a prime, while the estimation of the error term is more complicated due to the need to invoke 
results on the distribution of primes in arithmetic progressions. Finally, we establish Theorem 1.4 
in Section 4 by studying the moments of K* . 

Notation. As above, we employ the Landau-Bachmann o and O notation, as well as the associated 
Vinogradov symbols $\ll$, $\gg$ with their usual meanings; any dependence of implied constants on 
other parameters is denoted with subscripts. We reserve the letters $\ell$ and p for prime variables. For 
each natural number n, we let P(n) denote the largest prime factor of n, with the convention that 
P(1) = 1. The natural number n is said to be y-friable (sometimes called y-smooth) if $P(n) \le y$. 
We write $\Psi(x, y)$ for the number of y-friable integers not exceeding x. By a partition of a set S, 
we mean any collection of disjoint sets whose union is S; we do not require that all of the sets in 
the collection be nonempty. 
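As a small illustration of this notation, the following sketch computes P(n) by trial division and counts y-friable integers directly. The function names are illustrative, and the brute-force count is only meant for small x.

```python
def largest_prime_factor(n):
    """P(n), with the convention P(1) = 1."""
    largest, d = 1, 2
    while d * d <= n:
        while n % d == 0:
            largest, n = d, n // d
        d += 1
    # Whatever remains (if > 1) is a prime larger than every factor found so far.
    return n if n > 1 else largest

def psi(x, y):
    """Number of y-friable integers n with 1 <= n <= x."""
    return sum(1 for n in range(1, int(x) + 1) if largest_prime_factor(n) <= y)

print(psi(10, 3))  # counts 1, 2, 3, 4, 6, 8, 9
```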

2. The average of K* over odd integers 

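For notational convenience, set $R(N) := N/\phi(N)$, so that $K^*(N) = K(N)R(N)$. By definition, 
K(N) is a product over primes, while $R(N) = \prod_{\ell \mid N} (1 - 1/\ell)^{-1}$ can also be viewed as such a 
product. Moreover, it is the small primes that have the largest influence on the magnitude of these 
products. This suggests it might be useful to study the truncated functions $K_z$ and $R_z$ defined by 

\[ K_z(N) := \prod_{\substack{p \nmid N \\ p \le z}} \left(1 - \frac{\left(\frac{N-1}{p}\right)^2 p + 1}{(p-1)^2 (p+1)}\right) \prod_{\substack{p \mid N \\ 2 \nmid \nu_p(N) \\ p \le z}} \left(1 - \frac{1}{p^{\nu_p(N)}(p-1)}\right) \prod_{\substack{p \mid N \\ 2 \mid \nu_p(N) \\ p \le z}} \left(1 - \frac{p - \left(\frac{-N_p}{p}\right)}{p^{\nu_p(N)+1}(p-1)}\right) \]

and 

\[ R_z(N) := \prod_{\substack{p \mid N \\ p \le z}} \left(1 - \frac{1}{p}\right)^{-1}. \]

Indeed, our Theorem 1.2 concerning the average of K(N)R(N) is easily deduced from a 
corresponding estimate for the mean value of $K_z(N)R_z(N)$: 

Proposition 2.1. Let $x \ge e^{20}$, and set $z := \frac{1}{10} \log x$. We have 

\[ \sum_{\substack{N \le x \\ 2 \nmid N}} K_z(N) R_z(N) = \frac{x}{3} + O(x^{3/4}). \]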

We will establish this proposition at the end of this section (it follows upon combining Lem- 
mas 2.7 and 2.8). At this point, we show how Theorem 1.2 can be deduced from the proposition. 

Proof of Theorem 1.2, assuming Proposition 2.1. It suffices to show that with $z = \frac{1}{10} \log x$, 

\[ \sum_{\substack{N \le x \\ N \text{ odd}}} |K_z(N)R_z(N) - K(N)R(N)| \ll \frac{x}{z}. \tag{11} \]

Now $0 \le K(N) \le K_z(N) \le 1$ and $0 \le R_z(N) \le R(N)$, so that 

\[ \begin{aligned} |K_z(N)R_z(N) - K(N)R(N)| &\le |K_z(N)|\,|R_z(N) - R(N)| + |K_z(N) - K(N)|\,R(N) \\ &\le (R(N) - R_z(N)) + (K_z(N) - K(N))R(N). \end{aligned} \]

Thus, it is enough to show that the sums up to x of $R(N) - R_z(N)$ and $(K_z(N) - K(N))R(N)$ 
are each $\ll x/z$. As we are looking only for upper bounds, we may extend these sums over all 
$N \le x$ and not only odd N. 

Write $R(N) = \sum_{d \mid N} g(d)$ for an auxiliary function g. By a straightforward calculation with the 
Möbius inversion formula, we see that g vanishes except at squarefree integers d, in which case 
$g(d) = 1/\phi(d)$. Hence, for all real $t > 0$, 

\[ \sum_{N \le t} R(N) = \sum_{N \le t} \sum_{d \mid N} g(d) \le \sum_{d \le t} \frac{1}{\phi(d)} \sum_{\substack{N \le t \\ d \mid N}} 1 \le t \sum_{d \le t} \frac{1}{d\phi(d)} \ll t, \tag{12} \]

so that R(N) is bounded on average. Now writing $R_z(N) = \sum_{d \mid N} g_z(d)$ for an auxiliary function 
$g_z$, one finds that $g_z$ vanishes except on squarefree z-friable integers d, in which case again 
$g_z(d) = 1/\phi(d)$. In particular, $g(d) - g_z(d)$ is nonnegative for all d, and $g(d) - g_z(d) = 0$ when 
$d \le z$. We deduce that 

\[ \sum_{N \le x} (R(N) - R_z(N)) = \sum_{N \le x} \sum_{d \mid N} (g(d) - g_z(d)) \le \sum_{N \le x} \sum_{\substack{d \mid N \\ d > z}} \frac{1}{\phi(d)} = \sum_{z < d \le x} \frac{1}{\phi(d)} \sum_{\substack{N \le x \\ d \mid N}} 1 \le x \sum_{d > z} \frac{1}{d\phi(d)}. \]
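The divisor-sum representations of R and $R_z$ used in this step are easy to confirm in exact arithmetic. The following is a small sketch with illustrative function names; passing `z=None` gives the untruncated functions R and g.

```python
from fractions import Fraction

def factorize(n):
    fac, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            fac[d] = fac.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        fac[n] = fac.get(n, 0) + 1
    return fac

def phi(n):
    """Euler's totient function."""
    result = n
    for p in factorize(n):
        result = result // p * (p - 1)
    return result

def R(N, z=None):
    """N/phi(N), or its truncation R_z(N) to primes p <= z when z is given."""
    val = Fraction(1)
    for p in factorize(N):
        if z is None or p <= z:
            val *= Fraction(p, p - 1)
    return val

def g(d, z=None):
    """1/phi(d) on squarefree d (with all prime factors <= z if z is given), else 0."""
    fac = factorize(d)
    if any(v > 1 for v in fac.values()):
        return Fraction(0)
    if z is not None and any(p > z for p in fac):
        return Fraction(0)
    return Fraction(1, phi(d))

def divisor_sum(N, z=None):
    return sum(g(d, z) for d in range(1, N + 1) if N % d == 0)

print(R(12), divisor_sum(12), R(70, 5), divisor_sum(70, 5))
```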

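Partitioning this last sum into dyadic intervals, we have 

\[ \begin{aligned} \sum_{N \le x} (R(N) - R_z(N)) &\le x \sum_{k=1}^{\infty} \sum_{2^{k-1}z < d \le 2^k z} \frac{1}{d\phi(d)} = x \sum_{k=1}^{\infty} \sum_{2^{k-1}z < d \le 2^k z} \frac{R(d)}{d^2} \\ &\le x \sum_{k=1}^{\infty} \frac{1}{(2^{k-1}z)^2} \sum_{d \le 2^k z} R(d) \\ &\ll x \sum_{k=1}^{\infty} \frac{2^k z}{(2^{k-1}z)^2} = \frac{4x}{z} \sum_{k=1}^{\infty} \frac{1}{2^k} \ll \frac{x}{z}, \end{aligned} \]

where we used the estimate (12) in the second-to-last inequality. This proves the desired upper 
bound for the partial sums of $R(N) - R_z(N)$. 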

The partial sums of $(K_z(N) - K(N))R(N)$ are easier. Since each factor appearing in the 
products defining $K_z$ and K has the form $1 - O(1/\ell^2)$, it follows that 
$K(N)/K_z(N) \ge 1 - O\big(\sum_{\ell > z} 1/\ell^2\big) \ge 1 - O(1/z)$. Thus, 
$K_z(N) - K(N) = K_z(N)(1 - K(N)/K_z(N)) \le 1 - K(N)/K_z(N) \ll 1/z$. It follows that 

\[ \sum_{N \le x} (K_z(N) - K(N))R(N) \ll \frac{1}{z} \sum_{N \le x} R(N) \ll \frac{x}{z}, \]

using the estimate (12) once more in the last step. This completes the proof of Theorem 1.2, 
assuming Proposition 2.1. □ 

In the remainder of this section, we concentrate on proving Proposition 2.1. Our strategy, already 
alluded to in the introduction, is to partition the odd N < x according to local data at small primes. 
We choose the partition so that the values K Z (N) and R Z (N) are constant along each set belonging 
to the partition (which we call a configuration). For the remainder of this section, we continue to 
assume that $x \ge 2$ and that $z = \frac{1}{10} \log x$. 

Definition 2.2. We define the configuration space $\mathcal{S}$ as the set of all 6-tuples of the form 

\[ \left(A, B_1, B_2, C, \{e_\ell\}_{\ell \in B_1 \cup B_2}, \{a_\ell\}_{\ell \in B_2}\right), \]

where the sets $A, B_1, B_2, C$ partition the set of primes up to z, the $e_\ell$ are positive integers, and each 
$a_\ell$ satisfies $1 \le a_\ell \le \ell - 1$. (Although $\mathcal{S}$ depends upon z and hence x, we will not include this 
dependence in the notation.) 

To each odd $N \le x$, we can associate a unique configuration in the following manner. 

Definition 2.3. Given an odd $N \le x$, define three subsets of the primes in [2, z] by setting 
$A := \{\ell \le z : \ell \nmid N(N-1)\}$, $B := \{\ell \le z : \ell \mid N\}$, and $C := \{\ell \le z : \ell \mid N-1\}$. For each $\ell \in B$, set 
$e_\ell := \nu_\ell(N)$, and let 

\[ B_1 := \{\ell \in B : 2 \nmid e_\ell\} \quad \text{and} \quad B_2 := \{\ell \in B : 2 \mid e_\ell\}. \]

For $\ell \in B_2$, choose $a_\ell \in [1, \ell - 1]$ so that 

\[ N/\ell^{e_\ell} \equiv a_\ell \pmod{\ell}. \]

Then $\sigma = (A, B_1, B_2, C, \{e_\ell\}_{\ell \in B_1 \cup B_2}, \{a_\ell\}_{\ell \in B_2}) \in \mathcal{S}$ is called the configuration corresponding 
to N and is denoted $\sigma_N$. 
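Definition 2.3 translates directly into code. The sketch below computes the configuration of N, together with a modulus $m_\sigma$ and a density $d_\sigma$ written in the forms suggested by the counting argument for configurations in this section: one residue class modulo $\ell$ for $\ell \in C$, $\ell - 2$ classes for $\ell \in A$, $\ell - 1$ classes modulo $\ell^{e_\ell + 1}$ for $\ell \in B_1$, and a single class modulo $\ell^{e_\ell + 1}$ for $\ell \in B_2$. These formulas and all function names are assumptions made for illustration; the brute-force count over one full period is what confirms them.

```python
from fractions import Fraction
from math import prod

def primes_upto(z):
    return [p for p in range(2, int(z) + 1)
            if all(p % d for d in range(2, int(p**0.5) + 1))]

def configuration(N, z):
    """The 6-tuple (A, B1, B2, C, {e_l}, {a_l}) attached to N >= 1."""
    A, B1, B2, C, e, a = [], [], [], [], {}, {}
    for l in primes_upto(z):
        if N % l == 0:
            v, m = 0, N
            while m % l == 0:
                m //= l
                v += 1
            e[l] = v
            if v % 2:
                B1.append(l)
            else:
                B2.append(l)
                a[l] = m % l          # residue of N / l^e modulo l
        elif (N - 1) % l == 0:
            C.append(l)
        else:
            A.append(l)
    return (tuple(A), tuple(B1), tuple(B2), tuple(C),
            tuple(sorted(e.items())), tuple(sorted(a.items())))

def modulus(sigma):
    A, B1, B2, C, e, _ = sigma
    return prod(A) * prod(C) * prod(l ** (v + 1) for l, v in e)

def density(sigma):
    A, B1, B2, C, e, _ = sigma
    e = dict(e)
    d = Fraction(1)
    for l in A:
        d *= 1 - Fraction(2, l)
    for l in B1:
        d *= Fraction(1, l ** e[l]) * (1 - Fraction(1, l))
    for l in B2:
        d *= Fraction(1, l ** (e[l] + 1))
    for l in C:
        d *= Fraction(1, l)
    return d

sigma = configuration(45, 7)
m = modulus(sigma)
count = sum(1 for N in range(1, m + 1) if configuration(N, 7) == sigma)
print(sigma, m, density(sigma), count)
```

For N = 45 and z = 7 the count over one period equals `density(sigma) * modulus(sigma)`, as the congruence description predicts.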

Remark. One checks easily that the value $K_z(N)R_z(N)$ depends only on $\sigma = \sigma_N$. Thus, we often 
abuse notation by referring to $K_z(\sigma)$ and $R_z(\sigma)$ instead of $K_z(N)$ and $R_z(N)$. 

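The condition that N is odd corresponds exactly to the condition that $2 \in C$. Hence, we can 
rewrite the sum considered in Proposition 2.1 in the form 

\[ \sum_{\substack{N \le x \\ 2 \nmid N}} K_z(N)R_z(N) = \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C}} K_z(\sigma)R_z(\sigma) \sum_{\substack{N \le x \\ \sigma_N = \sigma}} 1. \tag{13} \]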

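In the next lemma, we estimate the inner sum on the right-hand side of (13) in two ways. 

Lemma 2.4. For each $\sigma \in \mathcal{S}$ with $2 \in C$, we have 

\[ \sum_{\substack{N \le x \\ \sigma_N = \sigma}} 1 = d_\sigma x + O(x^{1/5}), \tag{14} \]

where 

\[ d_\sigma = \Big(\prod_{\ell \in A} (1 - 2/\ell)\Big) \Big(\prod_{\ell \in B_1} \frac{1}{\ell^{e_\ell}} (1 - 1/\ell)\Big) \Big(\prod_{\ell \in B_2} \frac{1}{\ell^{e_\ell + 1}}\Big) \Big(\prod_{\ell \in C} \frac{1}{\ell}\Big). \tag{15} \]

We also have the crude upper bound 

\[ \sum_{\substack{N \le x \\ \sigma_N = \sigma}} 1 \le x \prod_{\ell \in B_1 \cup B_2} \ell^{-e_\ell} \tag{16} \]

for any $\sigma \in \mathcal{S}$. 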

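Proof. The condition that $\sigma_N = \sigma$ is equivalent to a congruence condition on N modulo 

\[ m_\sigma := \prod_{\ell \in A \cup C} \ell \prod_{\ell \in B_1 \cup B_2} \ell^{e_\ell + 1}. \tag{17} \]

Indeed, $\sigma_N = \sigma$ precisely when N belongs to a union of $\prod_{\ell \in A} (\ell - 2) \prod_{\ell \in B_1} (\ell - 1)$ congruence 
classes modulo $m_\sigma$. This implies that 

\[ \sum_{\substack{N \le x \\ \sigma_N = \sigma}} 1 = \frac{x}{m_\sigma} \prod_{\ell \in A} (\ell - 2) \prod_{\ell \in B_1} (\ell - 1) + O\Big(\prod_{\ell \in A \cup B_1} \ell\Big) = d_\sigma x + O\Big(\prod_{\ell \le z} \ell\Big). \]

By our choice of z and the prime number theorem, $\prod_{\ell \le z} \ell \le x^{1/5}$ for large x, and so we have 
established the formula (14). To justify the inequality (16), it suffices to observe that if $\sigma_N = \sigma$, 
then $\prod_{\ell \in B_1 \cup B_2} \ell^{e_\ell}$ divides N. □ 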

The modulus $m_\sigma$, defined in (17), will continue to play a key role in subsequent arguments. It 
will be convenient to know that $m_\sigma$ nearly determines $\sigma$; this is the substance of our next result. 

Lemma 2.5. For each natural number m, the number of $\sigma \in \mathcal{S}$ with $m_\sigma = m$ is $O(x^{1/4})$. 

Proof. Suppose that $m_\sigma = m$, where $\sigma = (A, B_1, B_2, C, \{e_\ell\}_{\ell \in B_1 \cup B_2}, \{a_\ell\}_{\ell \in B_2})$. Since the sets 
$A, B_1, B_2, C$ partition the primes up to z, the number of possibilities for these sets is at most 
$4^{\pi(z)} = \exp(O(\log x/\log\log x)) = x^{o(1)}$. Having chosen these sets, the exponents $e_\ell$, for $\ell \in B$, are 
determined by the prime factorization of m. Finally, since each $a_\ell$ is between 1 and $\ell - 1$, the 
number of choices for the $a_\ell$ is at most $\prod_{\ell \in B_2} (\ell - 1) \le \prod_{\ell \le z} \ell \le x^{1/5}$. Collecting these estimates 
establishes the lemma. □ 

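We next investigate two sums over $m_\sigma$ for future use in estimating error terms. 

Lemma 2.6. For each $\sigma \in \mathcal{S}$, define $m_\sigma$ by (17). Then for all $x \ge 3$, 

\[ x^{6/5} \log\log x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C \\ m_\sigma > x}} \frac{1}{m_\sigma} + x^{1/5} \log\log x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C \\ m_\sigma \le x}} 1 \ll x^{3/4}. \tag{18} \]

Proof. We proceed by Rankin's method: 

\[ \begin{aligned} x^{6/5} \log\log x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C \\ m_\sigma > x}} \frac{1}{m_\sigma} &+ x^{1/5} \log\log x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C \\ m_\sigma \le x}} 1 \\ &\le x^{6/5} \log\log x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C \\ m_\sigma > x}} \frac{1}{m_\sigma} \Big(\frac{m_\sigma}{x}\Big)^{7/8} + x^{1/5} \log\log x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C \\ m_\sigma \le x}} \Big(\frac{x}{m_\sigma}\Big)^{1/8} \\ &= x^{13/40} \log\log x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C}} \frac{1}{m_\sigma^{1/8}}. \end{aligned} \]

Every value of $m_\sigma$ is z-friable, and there are $O(x^{1/4})$ configurations $\sigma \in \mathcal{S}$ for every possible 
value of $m_\sigma$ by Lemma 2.5. Therefore 

\[ \begin{aligned} x^{13/40} \log\log x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C}} \frac{1}{m_\sigma^{1/8}} &\ll x^{13/40} \log\log x \cdot x^{1/4} \sum_{m \text{ $z$-friable}} \frac{1}{m^{1/8}} \\ &= x^{23/40} \log\log x \prod_{p \le z} \Big(1 + \frac{1}{p^{1/8}} + \frac{1}{p^{2/8}} + \cdots\Big) \\ &= x^{23/40} \log\log x \prod_{p \le z} \Big(1 - \frac{1}{p^{1/8}}\Big)^{-1}. \end{aligned} \]

Each factor in the product is at most $(1 - 2^{-1/8})^{-1} < 13$, and so the product is less than 
$13^{\pi(z)} = 13^{O(\log x/\log\log x)} = x^{o(1)}$. Thus the left-hand side of equation (18) is $\ll x^{23/40 + o(1)} \log\log x \ll x^{3/4}$ 
as claimed. □ 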



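The next lemma relates the mean value of $K_z(N)R_z(N)$, taken over odd N, to the sum of 
$K_z(\sigma)R_z(\sigma)d_\sigma$, taken over all configurations $\sigma$ with $2 \in C$. 

Lemma 2.7. For all $x \ge 2$, 

\[ \sum_{\substack{N \le x \\ 2 \nmid N}} K_z(N)R_z(N) = x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C}} K_z(\sigma)R_z(\sigma)d_\sigma + O(x^{3/4}). \]

Proof. We begin by noting that the upper bounds 

\[ 0 \le K(N) \le K_z(N) \le 1 \quad \text{and} \quad 0 \le R_z(N) \le R(N) = \prod_{p \mid N} \Big(1 - \frac{1}{p}\Big)^{-1} \ll \log\log x \tag{19} \]

are valid for all $N \le x$. We write 

\[ \begin{aligned} \sum_{\substack{N \le x \\ 2 \nmid N}} K_z(N)R_z(N) &= \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C}} K_z(\sigma)R_z(\sigma) \sum_{\substack{N \le x \\ \sigma_N = \sigma}} 1 \\ &= \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C \\ m_\sigma \le x}} K_z(\sigma)R_z(\sigma) \sum_{\substack{N \le x \\ \sigma_N = \sigma}} 1 + \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C \\ m_\sigma > x}} K_z(\sigma)R_z(\sigma) \sum_{\substack{N \le x \\ \sigma_N = \sigma}} 1 \\ &= \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C \\ m_\sigma \le x}} K_z(\sigma)R_z(\sigma)\big(d_\sigma x + O(x^{1/5})\big) + O\Big(\sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C \\ m_\sigma > x}} K_z(\sigma)R_z(\sigma)\, x \prod_{\ell \in B_1 \cup B_2} \ell^{-e_\ell}\Big) \end{aligned} \]

by Lemma 2.4. Using the upper bounds (19) for $K_z$ and $R_z$, we deduce after extending the first 
sum to infinity that 

\[ \begin{aligned} \sum_{\substack{N \le x \\ 2 \nmid N}} K_z(N)R_z(N) = x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C}} K_z(\sigma)R_z(\sigma)d_\sigma &+ O\Big(x \log\log x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C \\ m_\sigma > x}} d_\sigma\Big) \\ &+ O\Big(x^{1/5} \log\log x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C \\ m_\sigma \le x}} 1 + x \log\log x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C \\ m_\sigma > x}} \prod_{\ell \in B_1 \cup B_2} \ell^{-e_\ell}\Big); \end{aligned} \]

since the inequality $d_\sigma \le \prod_{\ell \in B_1 \cup B_2} \ell^{-e_\ell}$ follows from the definition (15), the first error term is 
dominated by the second. Because $\prod_{\ell \in B_1 \cup B_2} \ell^{-e_\ell} = m_\sigma^{-1} \prod_{\ell \le z} \ell \le m_\sigma^{-1} x^{1/5}$ once x is large, this 
error term is $\ll x^{3/4}$ by Lemma 2.6, and the proof is complete. □ 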

In view of Lemma 2.7, Proposition 2.1 is a consequence of the following remarkable identity: 

Lemma 2.8. Assume that $x \ge e^{20}$, so that $z \ge 2$. Then 

\[ \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C}} K_z(\sigma)R_z(\sigma)d_\sigma = \frac{1}{3}. \]

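Proof. Referring back to the definitions of $K_z$ and $R_z$, we see that for $\sigma \in \mathcal{S}$, 

\[ \begin{aligned} K_z(\sigma)R_z(\sigma) = {} & \prod_{\ell \in A} \Big(1 - \frac{1}{(\ell-1)^2}\Big) \prod_{\ell \in C} \Big(1 - \frac{1}{(\ell-1)^2(\ell+1)}\Big) \\ & \times \prod_{\ell \in B_1} \Big(1 - \frac{1}{\ell}\Big)^{-1} \Big(1 - \frac{1}{\ell^{e_\ell}(\ell-1)}\Big) \prod_{\ell \in B_2} \Big(1 - \frac{1}{\ell}\Big)^{-1} \Big(1 - \frac{\ell - \left(\frac{-a_\ell}{\ell}\right)}{\ell^{e_\ell+1}(\ell-1)}\Big). \end{aligned} \]

Multiplying by the expression (15) for $d_\sigma$, we find that 

\[ \begin{aligned} K_z(\sigma)R_z(\sigma)d_\sigma = {} & \prod_{\ell \in A} \Big(1 - \frac{2}{\ell}\Big)\Big(1 - \frac{1}{(\ell-1)^2}\Big) \prod_{\ell \in C} \frac{1}{\ell}\Big(1 - \frac{1}{(\ell-1)^2(\ell+1)}\Big) \\ & \times \prod_{\ell \in B_1} \frac{1}{\ell^{e_\ell}}\Big(1 - \frac{1}{\ell^{e_\ell}(\ell-1)}\Big) \prod_{\ell \in B_2} \frac{1}{\ell^{e_\ell}(\ell-1)}\Big(1 - \frac{\ell - \left(\frac{-a_\ell}{\ell}\right)}{\ell^{e_\ell+1}(\ell-1)}\Big). \end{aligned} \tag{20} \]

Recall that $\sigma$ is a 6-tuple with entries $A$, $B_1$, $B_2$, $C$, $\{e_\ell\}_{\ell \in B_1 \cup B_2}$, and $\{a_\ell\}_{\ell \in B_2}$. We hold the first five 
of these fixed and sum over the possibilities for $\{a_\ell\} = \{a_\ell\}_{\ell \in B_2}$. The resulting sum factors: 

\[ \sum_{\substack{\{a_\ell\}_{\ell \in B_2} \\ \text{each } a_\ell \in \{1, 2, \dots, \ell-1\}}} \prod_{\ell \in B_2} \frac{1}{\ell^{e_\ell}(\ell-1)}\Big(1 - \frac{\ell - \left(\frac{-a_\ell}{\ell}\right)}{\ell^{e_\ell+1}(\ell-1)}\Big) = \prod_{\ell \in B_2} \frac{1}{\ell^{e_\ell}(\ell-1)} \sum_{a=1}^{\ell-1} \Big(1 - \frac{\ell - \left(\frac{-a}{\ell}\right)}{\ell^{e_\ell+1}(\ell-1)}\Big). \]

Since $\sum_{a=1}^{\ell-1} \left(\frac{-a}{\ell}\right) = 0$, we obtain 

\[ \prod_{\ell \in B_2} \frac{1}{\ell^{e_\ell}}\Big(1 - \frac{1}{\ell^{e_\ell}(\ell-1)}\Big), \]

which matches the factor corresponding to $B_1$ in equation (20). Thus, fixing only $A$, $B_1$, $B_2$, $C$, 
$\{e_\ell\}_{\ell \in B_1 \cup B_2}$ but not any longer $\{a_\ell\}_{\ell \in B_2}$, summing over all corresponding $\sigma \in \mathcal{S}$ gives 

\[ \sum_{\substack{\sigma \in \mathcal{S} \\ A, B_1, B_2, C, \{e_\ell\} \text{ fixed}}} K_z(\sigma)R_z(\sigma)d_\sigma = \prod_{\ell \in A} \Big(1 - \frac{2}{\ell}\Big)\Big(1 - \frac{1}{(\ell-1)^2}\Big) \prod_{\ell \in C} \frac{1}{\ell}\Big(1 - \frac{1}{(\ell-1)^2(\ell+1)}\Big) \prod_{\ell \in B_1 \cup B_2} \frac{1}{\ell^{e_\ell}}\Big(1 - \frac{1}{\ell^{e_\ell}(\ell-1)}\Big). \]

Next, we sum this expression over the possibilities for $\{e_\ell\} = \{e_\ell\}_{\ell \in B_1 \cup B_2}$. We have 

\[ \sum_{\substack{\{e_\ell\} \\ \text{each } e_\ell \ge 1}} \prod_{\ell \in B_1 \cup B_2} \frac{1}{\ell^{e_\ell}}\Big(1 - \frac{1}{\ell^{e_\ell}(\ell-1)}\Big) = \prod_{\ell \in B_1 \cup B_2} \sum_{e=1}^{\infty} \frac{1}{\ell^{e}}\Big(1 - \frac{1}{\ell^{e}(\ell-1)}\Big). \]

By a short computation, 

\[ \sum_{e=1}^{\infty} \frac{1}{\ell^{e}}\Big(1 - \frac{1}{\ell^{e}(\ell-1)}\Big) = \frac{1}{\ell-1} - \frac{1}{(\ell+1)(\ell-1)^2}. \]

Thus, if we now fix only the sets $A$, $B := B_1 \cup B_2$, and $C$ and sum over all corresponding 
configurations $\sigma$, we have 

\[ \sum_{\substack{\sigma \in \mathcal{S} \\ A, B, C \text{ fixed}}} K_z(\sigma)R_z(\sigma)d_\sigma = \Big(\prod_{\ell \in A} P_A(\ell)\Big)\Big(\prod_{\ell \in B} P_B(\ell)\Big)\Big(\prod_{\ell \in C} P_C(\ell)\Big), \tag{21} \]

where for notational convenience we have defined 

\[ P_A(\ell) := \Big(1 - \frac{2}{\ell}\Big)\Big(1 - \frac{1}{(\ell-1)^2}\Big), \quad P_B(\ell) := \frac{1}{\ell-1} - \frac{1}{(\ell+1)(\ell-1)^2}, \quad P_C(\ell) := \frac{1}{\ell}\Big(1 - \frac{1}{(\ell-1)^2(\ell+1)}\Big). \tag{22} \]
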
To finish the proof, we sum the right-hand side of equation (21) over all possibilities for A, B, 
and C. The only conditions on the sets A, B, and C are that 2 belongs to C and that they partition 
the set of primes not exceeding z. Hence, 

\[ \begin{aligned} \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C}} K_z(\sigma)R_z(\sigma)d_\sigma &= \sum_{\substack{A, B, C \text{ disjoint} \\ A \cup B \cup C = \{\ell \le z\} \\ 2 \in C}} \Big(\prod_{\ell \in A} P_A(\ell)\Big)\Big(\prod_{\ell \in B} P_B(\ell)\Big)\Big(\prod_{\ell \in C} P_C(\ell)\Big) \\ &= P_C(2) \prod_{2 < \ell \le z} \big(P_A(\ell) + P_B(\ell) + P_C(\ell)\big). \end{aligned} \]

(We use here that $z \ge 2$.) However, $P_A(\ell) + P_B(\ell) + P_C(\ell) = 1$, identically! Thus, the right-hand 
side collapses down to simply $P_C(2) = \frac{1}{3}$. This completes the proof of the lemma, and so also of 
Proposition 2.1. □ 

Most mathematical coincidences have explanations, of course, and the magical-seeming $P_A(\ell) + P_B(\ell) + P_C(\ell) = 1$ 
is no different. One might guess that $P_A(\ell)$, $P_B(\ell)$, and $P_C(\ell)$ are probabilities 
of certain events occurring, and this is exactly right: as $\gamma$ ranges over all elements of $\mathrm{GL}_2(\mathbb{F}_\ell)$, 
the expression $\det(\gamma) + 1 - \mathrm{tr}(\gamma)$ is congruent to 0 (mod $\ell$) with probability $P_B(\ell)$, congruent to 
1 (mod $\ell$) with probability $P_C(\ell)$, and congruent to each of the $\ell - 2$ other residue classes with 
probability $P_A(\ell)/(\ell - 2)$. (See [5, equation (2.2)] for this computation, as well as for the precise 
connection to elliptic curves.) 
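This probabilistic interpretation can be confirmed by enumerating $\mathrm{GL}_2(\mathbb{F}_\ell)$ for a few small primes. In the sketch below the closed forms for $P_A$, $P_B$, $P_C$ are assumptions: they are what one obtains by multiplying each local factor of $K_z R_z$ by the corresponding local density and summing over the exponents $e \ge 1$. The enumeration itself does not depend on them, so a disagreement would show up as a failed comparison.

```python
from fractions import Fraction
from itertools import product

def P_A(l):
    return (1 - Fraction(2, l)) * (1 - Fraction(1, (l - 1) ** 2))

def P_B(l):
    return Fraction(1, l - 1) - Fraction(1, (l - 1) ** 2 * (l + 1))

def P_C(l):
    return Fraction(1, l) * (1 - Fraction(1, (l - 1) ** 2 * (l + 1)))

def residue_distribution(l):
    """Distribution of det + 1 - tr (mod l) over all invertible 2x2 matrices mod l."""
    counts, total = [0] * l, 0
    for a, b, c, d in product(range(l), repeat=4):
        det = (a * d - b * c) % l
        if det == 0:
            continue                      # not in GL_2(F_l)
        total += 1
        counts[(det + 1 - (a + d)) % l] += 1
    return [Fraction(c, total) for c in counts]

for l in (2, 3, 5, 7):
    dist = residue_distribution(l)
    print(l, dist[0] == P_B(l), dist[1] == P_C(l),
          all(dist[r] == P_A(l) / (l - 2) for r in range(2, l)),
          P_A(l) + P_B(l) + P_C(l) == 1)
```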

3. The average of K* over primes 

In this section we establish Theorem 1.3. The main component of the proof is the following 
asymptotic formula for the sum of the multiplicative function F evaluated on shifted primes. 

Proposition 3.1. Let F be the multiplicative function defined in equation (6), and let J be the 
constant defined in equation (3). For any $x \ge 2$ and for any positive real number A, 

\[ \sum_{2 < p \le x} F(p - 1) = J\pi(x) + O_A\big(x/(\log x)^A\big). \]

Proof. Write $F(n) = \sum_{d \mid n} g(d)$ for an auxiliary function g (not the same function as in the proof 
of Theorem 1.2), which is also multiplicative. By a direct computation with the Möbius inversion 
formula, g vanishes unless d is odd and squarefree, in which case 

\[ g(d) = \prod_{\ell \mid d} \frac{1}{(\ell - 2)(\ell + 1)}. \tag{23} \]

Writing $\pi(x; d, 1)$ for the number of primes $p \le x$ with $p \equiv 1 \pmod{d}$, we have 

\[ \begin{aligned} \sum_{2 < p \le x} F(p - 1) &= \sum_{2 < p \le x} \sum_{d \mid p - 1} g(d) = -g(1) + \sum_{p \le x} \sum_{d \mid p - 1} g(d) \\ &= \sum_{d \le (\log x)^A} g(d)\pi(x; d, 1) + \sum_{(\log x)^A < d \le x} g(d)\pi(x; d, 1) + O(1). \end{aligned} \tag{24} \]

We first consider the second sum on the right-hand side. Trivially, $\pi(x; d, 1) \le x/d$, and so 

\[ \sum_{(\log x)^A < d \le x} g(d)\pi(x; d, 1) \le x \sum_{d > (\log x)^A} \frac{g(d)}{d}. \tag{25} \]




When g(d) is nonvanishing, the formula (23) yields 

\[ g(d) = \prod_{\ell \mid d} \frac{1}{(\ell - 2)(\ell + 1)} \ll \frac{1}{d^2} \prod_{\ell \mid d} \Big(1 - \frac{1}{\ell}\Big)^{-1} = \frac{1}{d\phi(d)}, \]

and hence $g(d) \ll 1/d\phi(d)$ for all values of d. In particular, using the crude lower bound $\phi(d) \gg d^{1/2}$ 
(compare with the precise [13, Theorem 2.9, page 55]), we find that $g(d) \ll d^{-3/2}$. Thus 
equation (25) gives 

\[ \sum_{(\log x)^A < d \le x} g(d)\pi(x; d, 1) \ll x \sum_{d > (\log x)^A} d^{-5/2} \ll x(\log x)^{-3A/2}, \]

and so equation (24) becomes 

\[ \sum_{2 < p \le x} F(p - 1) = \sum_{d \le (\log x)^A} g(d)\pi(x; d, 1) + O\big(x(\log x)^{-3A/2}\big). \tag{26} \]

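To deal with the remaining sum, we invoke the Siegel-Walfisz theorem [13, Corollary 11.21, 
page 381]. That theorem implies that for a certain absolute constant c > 0, 

\[ \begin{aligned} \sum_{d \le (\log x)^A} g(d)\pi(x; d, 1) &= \sum_{d \le (\log x)^A} g(d)\Big(\frac{\pi(x)}{\phi(d)} + O_A\big(x \exp(-c\sqrt{\log x})\big)\Big) \\ &= \pi(x) \sum_{d \le (\log x)^A} \frac{g(d)}{\phi(d)} + O_A\Big(x \exp(-c\sqrt{\log x}) \sum_{d \le (\log x)^A} g(d)\Big) \\ &= \pi(x) \sum_{d=1}^{\infty} \frac{g(d)}{\phi(d)} + O_A\Big(\pi(x) \sum_{d > (\log x)^A} \frac{g(d)}{\phi(d)} + x \exp(-c\sqrt{\log x}) \sum_{d=1}^{\infty} g(d)\Big). \end{aligned} \]

In the error term, we again use the crude bounds $g(d) \ll d^{-3/2}$ and $\phi(d) \gg d^{1/2}$, obtaining 

\[ \sum_{d \le (\log x)^A} g(d)\pi(x; d, 1) = \pi(x) \sum_{d=1}^{\infty} \frac{g(d)}{\phi(d)} + O_A\big(\pi(x)(\log x)^{-A} + x \exp(-c\sqrt{\log x}) \cdot 1\big), \]

whereupon equation (26) becomes 

\[ \sum_{2 < p \le x} F(p - 1) = \pi(x) \sum_{d=1}^{\infty} \frac{g(d)}{\phi(d)} + O_A\big(x(\log x)^{-A}\big). \]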

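Finally, the constant in this main term is a convergent sum of a nonnegative multiplicative function, 
and hence it can be expressed as the Euler product 

\[ \sum_{d=1}^{\infty} \frac{g(d)}{\phi(d)} = \prod_{\ell} \Big(1 + \frac{g(\ell)}{\phi(\ell)} + \frac{g(\ell^2)}{\phi(\ell^2)} + \cdots\Big) = \prod_{\ell > 2} \Big(1 + \frac{1}{(\ell - 1)(\ell - 2)(\ell + 1)}\Big) = J \]

by equation (23). This completes the proof of the proposition. □ 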
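The statement of Proposition 3.1 lends itself to a quick floating-point experiment. The sketch below compares the empirical mean of F(p − 1) over odd primes p up to a bound with a truncated Euler product for J; the local value $g(\ell) = 1/((\ell-2)(\ell+1))$ is taken as an assumption matching the Euler product for J displayed in the proof, and all function names are illustrative.

```python
from math import prod

def primes_upto(n):
    """Sieve of Eratosthenes."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n**0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]

def odd_prime_divisors(n, primes):
    out = []
    for p in primes:
        if p * p > n:
            break
        if n % p == 0:
            if p > 2:
                out.append(p)
            while n % p == 0:
                n //= p
    if n > 2:                 # leftover cofactor is an odd prime
        out.append(n)
    return out

def F(n, primes):
    """The multiplicative function of equation (6), in floating point."""
    return prod((1 - 1 / ((p - 1) ** 2 * (p + 1))) / (1 - 1 / (p - 1) ** 2)
                for p in odd_prime_divisors(n, primes))

def g_local(p):
    """Assumed value of g at an odd prime p."""
    return 1 / ((p - 2) * (p + 1))

def J_trunc(primes):
    return prod(1 + 1 / ((p - 2) * (p - 1) * (p + 1)) for p in primes if p > 2)

def mean_F_shifted(primes):
    odd = [p for p in primes if p > 2]
    return sum(F(p - 1, primes) for p in odd) / len(odd)

primes = primes_upto(20000)
print(mean_F_shifted(primes), J_trunc(primes))
```

The two printed numbers should be close to each other, in line with the main term $J\pi(x)$.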




Proof of Theorem 1.3. We first claim that the asymptotic formula (4) for K* follows easily from the 
same asymptotic formula for K. Indeed, for each odd prime p, we have $K^*(p) = K(p)p/(p - 1) = K(p) + O(K(p)/p)$. 
Because each local factor in Definition 1.1 is of the form $1 + O(p^{-2})$, we see 
that K is absolutely bounded. Thus 

\[ \sum_{2 < p \le x} K^*(p) = \sum_{2 < p \le x} K(p) + O\Big(\sum_{2 < p \le x} \frac{1}{p}\Big) = \sum_{2 < p \le x} K(p) + O(\log\log x), \]

and so it suffices to establish the asymptotic formula (4) for K. 

For each odd prime p, the decomposition (5) gives $K(p) = \frac{2}{3} C_2 F(p - 1)G(p)$, where F and G 
are defined in equations (6) and (7), respectively. Again, all local factors in these definitions are of 
the form $1 + O(p^{-2})$; hence $G(p) = 1 + O(1/p^2)$ and F is absolutely bounded. Therefore, 

\[ \begin{aligned} \sum_{p \le x} K(p) &= K(2) + \sum_{2 < p \le x} \frac{2}{3} C_2 F(p - 1)G(p) \\ &= \frac{2}{3} C_2 \sum_{2 < p \le x} F(p - 1) + O\Big(1 + \sum_{2 < p \le x} \frac{1}{p^2}\Big) \\ &= \frac{2}{3} C_2 \sum_{2 < p \le x} F(p - 1) + O(1), \end{aligned} \]

and so the desired asymptotic formula (4) is a direct consequence of Proposition 3.1. □ 

4. The distribution function of K* 

The goal of this section is to establish the existence of the distribution function of $K^*(N)$ relative 
to the odd integers. We do so by bounding the following moments of $K^*(N)$: 

\[ \mu_k := \lim_{x \to \infty} \frac{1}{x/2} \sum_{\substack{N \le x \\ N \text{ odd}}} K^*(N)^k. \tag{27} \]

We describe below how Theorem 1.4 follows from Proposition 4.2. Before we can bound these 
moments, however, we must prove that the moments even exist. In Theorem 1.2 we determined 
that $\mu_1 = \frac{2}{3}$, and the same method of determining $\mu_k$ applies in general. 

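Proposition 4.1. For every natural number k, the limit (27) defining $\mu_k$ exists. 

Proof. Following the proof of Proposition 2.1, we obtain (with minimal changes to the argument) 
that for each fixed k, 

\[ \sum_{\substack{N \le x \\ N \text{ odd}}} (K_z(N)R_z(N))^k = x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C}} K_z(\sigma)^k R_z(\sigma)^k d_\sigma + O_k(x^{3/4}), \tag{28} \]

where $z = \frac{1}{10} \log x$ and $d_\sigma$ is defined in equation (15). Note that for $N \le x$, 

\[ \begin{aligned} \big|(K_z(N)R_z(N))^k - (K(N)R(N))^k\big| &\le k \max\{K(N)R(N), K_z(N)R_z(N)\}^{k-1} \cdot |K(N)R(N) - K_z(N)R_z(N)| \\ &\ll_k (\log\log x)^{k-1} \cdot |K(N)R(N) - K_z(N)R_z(N)| \end{aligned} \]

by the bounds in equation (19); therefore 

\[ \begin{aligned} \sum_{\substack{N \le x \\ N \text{ odd}}} K^*(N)^k &= \sum_{\substack{N \le x \\ N \text{ odd}}} (K_z(N)R_z(N))^k + \sum_{\substack{N \le x \\ N \text{ odd}}} \big((K(N)R(N))^k - (K_z(N)R_z(N))^k\big) \\ &= \sum_{\substack{N \le x \\ N \text{ odd}}} (K_z(N)R_z(N))^k + O_k\Big((\log\log x)^{k-1} \sum_{\substack{N \le x \\ N \text{ odd}}} |K(N)R(N) - K_z(N)R_z(N)|\Big). \end{aligned} \]

Using equation (28) in the main term and the estimate (11) in the error term, we obtain 

\[ \begin{aligned} \sum_{\substack{N \le x \\ N \text{ odd}}} K^*(N)^k &= x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C}} K_z(\sigma)^k R_z(\sigma)^k d_\sigma + O_k\big(x^{3/4} + (\log\log x)^{k-1} x/z\big) \\ &= x \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C}} K_z(\sigma)^k R_z(\sigma)^k d_\sigma + O_k\Big(\frac{x(\log\log x)^{k-1}}{\log x}\Big). \end{aligned} \]

Dividing both sides by x/2 and passing to the limit, we deduce that 

\[ \mu_k = 2 \lim_{x \to \infty} \sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C}} K_z(\sigma)^k R_z(\sigma)^k d_\sigma, \tag{29} \]

provided that this limit exists. 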

To compute the sum over $\sigma$ in (29), we follow the proof of Lemma 2.8; however, the details 
are significantly messier. With the six components $A$, $B_1$, $B_2$, $C$, $\{e_\ell\}_{\ell \in B_1 \cup B_2}$, $\{a_\ell\}_{\ell \in B_2}$ of $\sigma$ as 
before, we write down the expansion for $K_z(\sigma)^k R_z(\sigma)^k d_\sigma$ analogous to (20). This expansion 
has four components, which are products over primes $\ell$ in $A$, $B_1$, $B_2$, and $C$. The $B_1$ product 
depends additionally on the tuple $\{e_\ell\}_{\ell \in B_1}$, while the product over $B_2$ depends on both $\{e_\ell\}_{\ell \in B_2}$ 
and $\{a_\ell\}_{\ell \in B_2}$. We sum over all possibilities for $a_\ell \in [1, \ell - 1]$ to remove the final dependence, 
and then we sum over all possibilities for even natural numbers $\{e_\ell\}_{\ell \in B_2}$ and odd natural numbers 
$\{e_\ell\}_{\ell \in B_1}$. After straightforward but uninspiring computations, we find that fixing only $A$, $B_1$, $B_2$, 
and $C$, 

$$\sum K_z(\sigma)^k R_z(\sigma)^k d_\sigma = \Bigl( \prod_{\ell \in A} P_A(\ell) \Bigr) \Bigl( \prod_{\ell \in B_1} P_{B_1}(\ell) \Bigr) \Bigl( \prod_{\ell \in B_2} P_{B_2}(\ell) \Bigr) \Bigl( \prod_{\ell \in C} P_C(\ell) \Bigr),$$

where (we suppress the dependence on k in the notation on the left-hand sides) 



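$$\begin{aligned}
P_A(\ell) &= \Bigl(1 - \frac{2}{\ell}\Bigr)^{k+1} \Bigl(1 - \frac{1}{\ell}\Bigr)^{-2k}, \\
P_{B_1}(\ell) &= \sum_{\substack{e \ge 1 \\ e \text{ odd}}} \frac{1}{\ell^{e}} \Bigl(1 - \frac{1}{\ell}\Bigr)^{1-k} \Bigl(1 - \frac{1}{\ell^{e}(\ell - 1)}\Bigr)^{k}, \\
P_{B_2}(\ell) &= \sum_{\substack{e \ge 2 \\ e \text{ even}}} \frac{1}{2\ell^{e}} \Bigl(1 - \frac{1}{\ell}\Bigr)^{1-k} \Bigl( \Bigl(1 - \frac{1}{\ell^{e+1}}\Bigr)^{k} + \Bigl(1 - \frac{\ell + 1}{\ell^{e+1}(\ell - 1)}\Bigr)^{k} \Bigr), \\
P_C(\ell) &= \frac{1}{\ell} \Bigl(1 - \frac{1}{(\ell - 1)^2 (\ell + 1)}\Bigr)^{k}.
\end{aligned} \tag{30}$$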

(Note that when $k = 1$, these expressions reduce to the expressions in equation (22), at least after 
we set $P_B(\ell) = P_{B_1}(\ell) + P_{B_2}(\ell)$.) To compute the sum appearing in (29), we sum over $A$, $B_1$, 
$B_2$, and $C$. The only conditions on $A$, $B_1$, $B_2$, and $C$ are that 2 belongs to $C$ and that the four sets 
partition the primes in $[2, z]$. So if $x \ge e^{20}$ (so that $z \ge 2$), 

$$\sum_{\substack{\sigma \in \mathcal{S} \\ 2 \in C}} K_z(\sigma)^k R_z(\sigma)^k d_\sigma = P_C(2) \prod_{2 < \ell \le z} \bigl( P_A(\ell) + P_{B_1}(\ell) + P_{B_2}(\ell) + P_C(\ell) \bigr),$$

and so from equation (29), 

$$\mu_k = 2 P_C(2) \prod_{\ell > 2} \bigl( P_A(\ell) + P_{B_1}(\ell) + P_{B_2}(\ell) + P_C(\ell) \bigr). \tag{31}$$
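The passage from a sum over all partitions of the primes into four classes to a product of per-prime sums is an instance of the distributive law. The following minimal sketch checks that identity on a toy example. The prime list and the numerical weights are hypothetical placeholders chosen only for illustration; they are not the quantities $P_A$, $P_{B_1}$, $P_{B_2}$, $P_C$ of the text.

```python
from itertools import product
from math import isclose, prod

# Hypothetical small set of odd primes standing in for the primes in (2, z].
primes = [3, 5, 7, 11]
labels = ("A", "B1", "B2", "C")

# Placeholder per-prime weights (assumed values, for illustration only).
weights = {
    l: {"A": 1 - 2 / l, "B1": 0.7 / l, "B2": 0.3 / l**2, "C": 1 / l}
    for l in primes
}

# Left side: sum over every way of assigning each prime to one of the four
# classes (i.e. every ordered partition of the primes into A, B1, B2, C).
lhs = sum(
    prod(weights[l][lab] for l, lab in zip(primes, assignment))
    for assignment in product(labels, repeat=len(primes))
)

# Right side: product over primes of the sum of the four weights.
rhs = prod(sum(weights[l].values()) for l in primes)

print(lhs, rhs)
```

Up to floating-point rounding the two printed numbers agree, which is the mechanism that turns the sum over partitions into an Euler product.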

It remains to show that this product converges. From their definitions (30), we find that 

$$\begin{aligned}
P_A(\ell) &= 1 - 2/\ell + O_k(1/\ell^2), \\
P_{B_1}(\ell) &= 1/\ell + O_k(1/\ell^2), \\
P_{B_2}(\ell) &= O_k(1/\ell^2), \\
P_C(\ell) &= 1/\ell + O_k(1/\ell^2).
\end{aligned}$$

It follows that each term in the product from equation (31) is $1 + O_k(1/\ell^2)$; consequently, that 
product converges, which completes the proof of the proposition. □ 

Remarks. 

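(i) For any given $k$, we can explicitly compute $P_A$, $P_{B_1}$, $P_{B_2}$, and $P_C$ and thus write down an 
exact expression for $\mu_k$ as an infinite product over primes. For example, taking $k = 2$, we 
find that 

$$\mu_2 = \frac{4}{9} \prod_{\ell > 2} \Bigl( 1 + \frac{\ell^8 - \ell^7 - \ell^5 - \ell^4 - \ell^3 - \ell^2 + 1}{\ell (\ell - 1)^4 (\ell + 1)^2 (\ell^2 - \ell + 1)(\ell^2 + \ell + 1)} \Bigr) \approx 0.47739.$$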

(ii) We can also compute the moments of $K^*(N)$ over all natural numbers $N$, not only odd $N$. 
Minor adjustments have to be made to the argument to account for the fact that the Kronecker 
symbol $\left(\frac{\cdot}{2}\right)$ is periodic modulo 8 rather than modulo 2; however, the difficulties are 
all notational, not theoretical. After making the necessary changes, one finds that the $k$th 
moment of $K^*(N)$ is 

$$\prod_{\ell} \bigl( P_A(\ell) + P_{B_1}(\ell) + P_{B_2}(\ell) + P_C(\ell) \bigr)$$

using the same definitions (30). 
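
The product formula (31) lends itself to a quick numerical sanity check. The sketch below is an illustration under stated assumptions, not part of the argument: the function names (`primes_up_to`, `local_weights`, `mu`), the truncation threshold for the series over $e$, and the cutoff `prime_bound` are all choices made here, and `local_weights` encodes the four local quantities of (30) as series over the exponent $e$, as we read them.

```python
def primes_up_to(n):
    """Return all primes <= n using a simple sieve of Eratosthenes."""
    is_prime = [True] * (n + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            for m in range(p * p, n + 1, p):
                is_prime[m] = False
    return [p for p in range(2, n + 1) if is_prime[p]]


def local_weights(l, k):
    """Return (P_A, P_B1, P_B2, P_C) at the prime l for the k-th moment.

    The series over e are truncated once l**(-e) drops below 1e-20
    (an assumed cutoff, far below double precision).
    """
    x = 1.0 / l
    r = (1 - x) ** (1 - k)  # common factor (1 - 1/l)^(1 - k)
    p_a = (1 - 2 * x) ** (k + 1) * (1 - x) ** (-2 * k)
    p_c = x * (1 - 1 / ((l - 1) ** 2 * (l + 1))) ** k
    p_b1 = p_b2 = 0.0
    e = 1
    while x ** e > 1e-20:
        if e % 2 == 1:  # odd exponents contribute to P_B1
            p_b1 += x ** e * r * (1 - x ** e / (l - 1)) ** k
        else:           # even exponents contribute to P_B2
            plus = (1 - x ** (e + 1)) ** k
            minus = (1 - (l + 1) * x ** (e + 1) / (l - 1)) ** k
            p_b2 += x ** e * r * (plus + minus) / 2
        e += 1
    return p_a, p_b1, p_b2, p_c


def mu(k, prime_bound=2000):
    """Approximate mu_k by truncating the product (31) at prime_bound."""
    value = 2 * local_weights(2, k)[3]  # the factor 2 * P_C(2)
    for l in primes_up_to(prime_bound):
        if l > 2:
            value *= sum(local_weights(l, k))
    return value


print(mu(1), mu(2))
```

Since each factor of the product is $1 + O_k(1/\ell^2)$, truncating at a few thousand primes already stabilizes several decimal places; the printed values can be compared with $\mu_1$ from Theorem 1.2 and with the numerical value of $\mu_2$ quoted in Remark (i).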

Proposition 4.2. The moments $\mu_k$ satisfy $\log \mu_k \ll k \log\log k$. 

Remark. Proposition 4.2 implies in particular that $\lim_{k \to \infty} \frac{1}{2k} \mu_{2k}^{1/2k} = 0$. It is well known from 
the theory of probability (see, for example, [6, Theorem 3.3.12, page 123]) that the weaker 
condition $\limsup_{k \to \infty} \frac{1}{2k} \mu_{2k}^{1/2k} < \infty$ implies the existence of a distribution function. So to establish 
Theorem 1.4, it indeed suffices to prove Proposition 4.2. 

Proof. Recall that $R(N)$ denotes the function $N/\varphi(N)$. We also let $1_{\mathrm{odd}}$ denote the characteristic 
function of the odd natural numbers. The number $\mu_k$ is twice the $k$th moment of the function 
$1_{\mathrm{odd}} \cdot K(N)R(N)$, and that function is bounded pointwise by $R(N)$. So $\mu_k$ is bounded above by 
$2\mu'_k$, where 

$$\mu'_k := \lim_{x \to \infty} \frac{1}{x} \sum_{N \le x} R(N)^k.$$

Thus, it suffices to establish the estimate $\log \mu'_k \ll k \log\log k$. 

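By a result known already to Schur (see [14, page 194]; see also [13, Exercise 14, page 42]), we 
have that for each $k$, 

$$\mu'_k = \prod_p \Bigl( 1 + \frac{1}{p} \Bigl( \Bigl( \frac{p}{p-1} \Bigr)^k - 1 \Bigr) \Bigr).$$

By the mean value theorem, 

$$\begin{aligned}
1 + \frac{1}{p} \Bigl( \Bigl( \frac{p}{p-1} \Bigr)^k - 1 \Bigr)
&\le 1 + \frac{k}{p(p-1)} \Bigl( \frac{p}{p-1} \Bigr)^{k-1} \\
&= 1 + O\Bigl( \frac{k}{p^2} \Bigl( 1 + \frac{1}{p-1} \Bigr)^{k-1} \Bigr) \\
&\le 1 + O\Bigl( \frac{k}{p^2} \exp\Bigl( \frac{k-1}{p-1} \Bigr) \Bigr),
\end{aligned}$$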


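and so 

$$\mu'_k \le \prod_{p \le k} \Bigl( 1 + O\Bigl( \frac{k}{p^2} \exp\Bigl( \frac{k-1}{p-1} \Bigr) \Bigr) \Bigr) \prod_{p > k} \Bigl( 1 + O\Bigl( \frac{k}{p^2} \exp\Bigl( \frac{k-1}{p-1} \Bigr) \Bigr) \Bigr). \tag{32}$$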

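In the first product, we use the crude inequality 

$$1 + O\Bigl( \frac{k}{p^2} \exp\Bigl( \frac{k-1}{p-1} \Bigr) \Bigr) \le 1 + O\Bigl( k \exp\Bigl( \frac{k-1}{p-1} \Bigr) \Bigr) \ll k \exp\Bigl( \frac{k}{p-1} \Bigr),$$

so that for some absolute constant $C$, 

$$\begin{aligned}
\prod_{p \le k} \Bigl( 1 + O\Bigl( \frac{k}{p^2} \exp\Bigl( \frac{k-1}{p-1} \Bigr) \Bigr) \Bigr)
&\le \prod_{p \le k} C k \exp\Bigl( \frac{k}{p-1} \Bigr) \\
&\le (Ck)^{\pi(k)} \exp\Bigl( k \sum_{p \le k} \frac{1}{p-1} \Bigr) \\
&= \exp(O(k)) \exp(O(k \log\log k)).
\end{aligned}$$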
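In the second product, the exponential factor is uniformly bounded, and so 

$$\prod_{p > k} \Bigl( 1 + O\Bigl( \frac{k}{p^2} \exp\Bigl( \frac{k-1}{p-1} \Bigr) \Bigr) \Bigr) \le \prod_{p > k} \Bigl( 1 + O\Bigl( \frac{k}{p^2} \Bigr) \Bigr) \le \exp\Bigl( O\Bigl( \sum_{p > k} \frac{k}{p^2} \Bigr) \Bigr) = \exp(O(1)).$$

In light of these last two estimates, equation (32) yields $\mu'_k \le \exp(O(k \log\log k))$, as required. □ 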
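
The Euler product for the moments of $N/\varphi(N)$ used in this proof can be compared with a direct average. The sketch below is only a numerical illustration under assumptions made here: the function names, the cutoff `x = 20000`, and the truncation of the product at the same bound are arbitrary choices, and the product implemented is $\prod_p \bigl(1 + \frac{1}{p}((p/(p-1))^k - 1)\bigr)$, the standard mean value of the multiplicative function $(N/\varphi(N))^k$.

```python
def totients_up_to(n):
    """Return a list phi with phi[m] = Euler's totient of m for 0 <= m <= n."""
    phi = list(range(n + 1))
    for p in range(2, n + 1):
        if phi[p] == p:  # untouched so far, hence p is prime
            for m in range(p, n + 1, p):
                phi[m] -= phi[m] // p
    return phi


def empirical_moment(k, x):
    """Average of (N / phi(N))**k over 1 <= N <= x."""
    phi = totients_up_to(x)
    return sum((n / phi[n]) ** k for n in range(1, x + 1)) / x


def euler_product_moment(k, prime_bound):
    """Product over primes p <= prime_bound of 1 + ((p/(p-1))**k - 1)/p."""
    phi = totients_up_to(prime_bound)
    value = 1.0
    for p in range(2, prime_bound + 1):
        if phi[p] == p - 1:  # characterizes primes among p >= 2
            value *= 1 + ((p / (p - 1)) ** k - 1) / p
    return value


for k in (1, 2, 3):
    print(k, empirical_moment(k, 20000), euler_product_moment(k, 20000))
```

For small $k$ the two columns agree to a few decimal places; the proposition concerns the growth of the logarithm of these quantities as $k$ increases, which this finite experiment can only hint at.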
Remark. There is an alternative, more arithmetic approach to the proof of Theorem 1.4, based 
on ideas and results of Erdős [7] and Shapiro [15]. This approach allows us to show that the 
distribution function $D(u)$ of Theorem 1.4 is continuous everywhere. Moreover, if $u_0 := \frac{2}{3} C_2$, 
then $D(u_0) = 0$ and $D(u)$ is strictly increasing for $u > u_0$. We omit the somewhat lengthy 
arguments for these claims. 